One of the main tasks of a LINQ query is to restrict
the results from a larger collection based on some criteria. This is
achieved using the Where operator,
which tests each element within a source collection and returns only
those elements that return a true result when tested against a given
predicate expression. A predicate
is simply an expression that takes an element of the same type as the
items in the source collection and returns true or false. This predicate
is passed to the Where clause using a lambda expression.
The extension method for the Where operator is surprisingly simple; it iterates the source collection using a foreach loop, testing each element as it goes, returning those that pass. Here is a close facsimile of the actual code in the System.Linq library:
public delegate TResult Func<T1, TResult>(T1 arg1);
public static IEnumerable<T> Where<T>(
    this IEnumerable<T> source,
    Func<T, bool> predicate) {
    foreach (T element in source) {
        if (predicate(element))
            yield return element;
    }
}
The LINQ to Objects Where operator can be this simple because of the yield return
statement, which first appeared in C# 2.0 (released with the .NET Framework 2.0) to make
building collection iterators easier. Any code implementing the built-in
enumeration pattern (as codified by any collection that implements the
interface IEnumerable)
natively supports callers asking for the next item in a collection, at
which time the next item for return is computed (the foreach keyword is one such caller). Any collection implementing IEnumerable<T> (which also implements IEnumerable) is extended by the Where
operator, which returns a single element at a time when asked, as
long as that element satisfies the predicate expression (returns true).
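The on-demand behavior described above can be seen in a small sketch. MyWhere and LazyFilterDemo are hypothetical names used only for illustration (this is not the System.Linq source), and the Console.WriteLine call inside the iterator exists only to make each predicate test visible.

```csharp
using System;
using System.Collections.Generic;

public static class LazyFilterDemo
{
    // Hypothetical stand-in for Where, named MyWhere so it cannot be
    // confused with the real System.Linq operator.
    public static IEnumerable<T> MyWhere<T>(
        this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T element in source)
        {
            // Illustration only: show when each element is actually tested.
            Console.WriteLine("  testing {0}", element);
            if (predicate(element))
                yield return element;
        }
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 3, 4 };

        // Calling MyWhere only creates the iterator; no element is tested yet.
        IEnumerable<int> evens = numbers.MyWhere(n => n % 2 == 0);
        Console.WriteLine("query defined");

        // Each pass of the loop asks the iterator for the next match.
        foreach (int n in evens)
            Console.WriteLine("got {0}", n);
    }
}
```

When run, "query defined" is printed before any "testing" line, because the iterator body does not start until foreach asks for the first element.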
Filter predicate
expressions are passed to the extension method using a lambda expression, although if the query expression syntax
is used, the filter predicate takes an even cleaner form. Both of these
predicate expression styles are explained and covered in detail in the
following sections.
Where Filter Using a Lambda Expression
When forming a predicate for the Where
operator, the predicate takes an input element of the same type as the
elements in the source collection and returns true or false (a Boolean
value). To demonstrate using a simple Where clause predicate, consider the following code:
string[] animals = new string[] { "Koala", "Kangaroo",
    "Spider", "Wombat", "Snake", "Emu", "Shark",
    "Sting-Ray", "Jellyfish" };

var q = animals.Where(
    a => a.StartsWith("S") && a.Length > 5);

foreach (string s in q)
    Console.WriteLine(s);
In this code, each string value from the animals array is passed to the predicate in a range variable called a. Each value of a
is evaluated against the predicate function, and only those strings
that pass (return true) are returned in the query results. For this
example, only two strings pass the test and are written to the console
window. They are
Spider
Sting-Ray
Behind the scenes, the C# compiler treats the lambda expression much like a
standard anonymous method (the following code is functionally
equivalent):
var q = animals.Where(
    delegate(string a) {
        return a.StartsWith("S") && a.Length > 5; });
The Where clause will only begin testing the predicate when somebody (you through a foreach statement, or one of the other standard query operators that have an internal foreach
statement) tries to iterate through the results; between requests, the
iterator just remembers exactly where it was the last time it
was asked for an element. This is called deferred execution, and it
gives you some predictability and control over when and how a query is
executed. If you want results immediately, you can call ToList(), ToArray(),
or one of the other standard operators that immediately copy
the results into another form; otherwise, the query will
begin evaluation only when you begin iterating over it.
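A minimal sketch of the difference between deferred and immediate evaluation follows. The class name DeferredExecutionDemo and the two-item list are assumptions chosen for illustration; Where, ToList, and Count are the standard System.Linq operators.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    public static void Main()
    {
        List<string> animals = new List<string> { "Spider", "Snake" };

        // Deferred: the predicate has not been run against any element yet.
        IEnumerable<string> deferred = animals.Where(a => a.Length > 5);

        // Immediate: ToList() runs the query now and copies the results.
        List<string> snapshot = animals.Where(a => a.Length > 5).ToList();

        // Change the source after both queries were defined.
        animals.Add("Sting-Ray");

        // The deferred query is evaluated here, so it sees the new item.
        Console.WriteLine(deferred.Count());   // 2 (Spider, Sting-Ray)

        // The snapshot was fixed when ToList() ran.
        Console.WriteLine(snapshot.Count);     // 1 (Spider)
    }
}
```

The deferred query reflects whatever the source contains at the moment it is iterated, which is worth keeping in mind when the source collection can change between defining and consuming a query.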
Where Filter Query Expressions (Preferred)
The query expression where clause syntax drops the explicit range variable definition on the predicate (the from clause declares it once) and the lambda expression operator (=>),
making it more concise and closer to the SQL-style clauses that
many developers already understand. It is the preferred syntax for these
reasons. Rewriting the previous example using query expression syntax
demonstrates these differences, as follows:
string[] animals = new string[] { "Koala", "Kangaroo",
    "Spider", "Wombat", "Snake", "Emu", "Shark",
    "Sting-Ray", "Jellyfish" };

var q = from a in animals
        where a.StartsWith("S") && a.Length > 5
        select a;

foreach (string s in q)
    Console.WriteLine(s);
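As a quick check that the two syntaxes are interchangeable, the following sketch builds the same filter both ways and compares the results with SequenceEqual. The class name SyntaxEquivalenceDemo and the variable names are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SyntaxEquivalenceDemo
{
    public static void Main()
    {
        string[] animals = { "Koala", "Kangaroo", "Spider", "Wombat",
            "Snake", "Emu", "Shark", "Sting-Ray", "Jellyfish" };

        // Query expression syntax.
        IEnumerable<string> viaQuery =
            from a in animals
            where a.StartsWith("S") && a.Length > 5
            select a;

        // Extension method syntax with a lambda expression.
        IEnumerable<string> viaMethod =
            animals.Where(a => a.StartsWith("S") && a.Length > 5);

        // Both queries yield the same elements in the same order.
        Console.WriteLine(viaQuery.SequenceEqual(viaMethod));   // True
    }
}
```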
Using an External Method for Evaluation
Although
you can write queries and inline the code for the filter predicate, you
don’t have to. If the predicate is lengthy and might be used in more
than one query expression, you should consider putting it in its own
method body (good practice for any duplicated code). Rewriting the
previous examples using an external predicate function shows the
technique:
string[] animals = new string[] { "Koala", "Kangaroo",
    "Spider", "Wombat", "Snake", "Emu", "Shark",
    "Sting-Ray", "Jellyfish" };

var q = from a in animals
        where MyPredicate(a)
        select a;

foreach (string s in q)
    Console.WriteLine(s);

// Declared static so it can be called from a static method such as Main.
public static bool MyPredicate(string a)
{
    // The Boolean expression is already the result; no if/else is needed.
    return a.StartsWith("S") && a.Length > 5;
}
To further demonstrate this technique with a slightly more complex scenario, the code shown in Listing 1
creates a predicate method that encapsulates the logic for determining
“a deadly animal.” By encapsulating this logic in one method, it doesn’t
have to be duplicated in multiple places in an application.
Listing 1. Where clause using external method—see Output 1
string[] animals = new string[] { "Koala", "Kangaroo",
    "Spider", "Wombat", "Snake", "Emu", "Shark",
    "Sting-Ray", "Jellyfish" };

var q = from a in animals
        where IsAnimalDeadly(a)
        select a;

foreach (string s in q)
    Console.WriteLine("A {0} can be deadly.", s);

public static bool IsAnimalDeadly(string s)
{
    string[] deadly = new string[] {"Spider", "Snake",
        "Shark", "Sting-Ray", "Jellyfish"};

    return deadly.Contains(s);
}
Output 1.
A Spider can be deadly.
A Snake can be deadly.
A Shark can be deadly.
A Sting-Ray can be deadly.
A Jellyfish can be deadly.
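Because the predicate lives in one method, it can be reused from the extension method syntax as well. The sketch below passes IsAnimalDeadly directly as a method group and also negates it inside a lambda; the class name SharedPredicateDemo and the variable names deadlyOnes and harmless are assumptions for illustration.

```csharp
using System;
using System.Linq;

public static class SharedPredicateDemo
{
    public static bool IsAnimalDeadly(string s)
    {
        string[] deadly = { "Spider", "Snake", "Shark", "Sting-Ray", "Jellyfish" };
        return deadly.Contains(s);
    }

    public static void Main()
    {
        string[] animals = { "Koala", "Kangaroo", "Spider", "Wombat",
            "Snake", "Emu", "Shark", "Sting-Ray", "Jellyfish" };

        // Method group: the compiler converts it to a Func<string, bool>.
        var deadlyOnes = animals.Where(IsAnimalDeadly);

        // The same method reused inside a lambda to express the opposite test.
        var harmless = animals.Where(a => !IsAnimalDeadly(a));

        Console.WriteLine(string.Join(", ", deadlyOnes));
        // Spider, Snake, Shark, Sting-Ray, Jellyfish
        Console.WriteLine(string.Join(", ", harmless));
        // Koala, Kangaroo, Wombat, Emu
    }
}
```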
Filtering by Index Position
The standard query operators expose a variation of the Where
operator that surfaces the index position of each collection element as
it progresses. The zero-based index position can be passed into a
lambda expression predicate by assigning a variable name as the second
argument (after the element range variable). To surface the index
position, a lambda expression must be used, and this can only be
achieved using the extension method syntax. Listing 2
demonstrates the simplest usage, in this case returning only the
elements at even index positions (the first element and then every
other one) from the source collection, as shown in Output 2.
Listing 2. The index position can be used as part of the Where clause predicate expression when using lambda expressions—see Output 2
string[] animals = new string[] { "Koala", "Kangaroo",
    "Spider", "Wombat", "Snake", "Emu", "Shark",
    "Sting-Ray", "Jellyfish" };

// get the first then every other animal (index is even)
var q = animals.Where((a, index) => index % 2 == 0);

foreach (string s in q)
    Console.WriteLine(s);
Output 2.
Koala
Spider
Snake
Shark
Jellyfish
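The index and the element can also be tested together in the same predicate. The following sketch, using the hypothetical class name IndexAndValueDemo, keeps only the animals that sit at an even index position and also start with "S".

```csharp
using System;
using System.Linq;

public static class IndexAndValueDemo
{
    public static void Main()
    {
        string[] animals = { "Koala", "Kangaroo", "Spider", "Wombat",
            "Snake", "Emu", "Shark", "Sting-Ray", "Jellyfish" };

        // index is the zero-based position of a in the source collection.
        var q = animals.Where(
            (a, index) => index % 2 == 0 && a.StartsWith("S"));

        foreach (string s in q)
            Console.WriteLine(s);
        // Spider
        // Snake
        // Shark
    }
}
```

Sting-Ray is excluded even though it starts with "S", because it sits at index 7, an odd position.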